ABSTRACT

Nitrated polycyclic aromatic hydrocarbons are 
common environmental pollutants, many of which
are mutagenic and carcinogenic. 1-Nitropyrene is
the most abundant nitrated polycyclic aromatic 
hydrocarbon, which causes DNA damage and is 
carcinogenic in experimental animals. Error-prone 
translesion synthesis of 1-nitropyrene-derived
DNA lesions generates mutations that likely play a 
role in the etiology of cancer. Here, we report two 
crystal structures of the human Y-family DNA polymerase
iota (polι) complexed with the major 1-nitropyrene
DNA lesion at the insertion stage, incorporating
either dCTP or dATP nucleotide opposite the
lesion. Polι maintains the adduct in its active site
in two distinct conformations. dCTP forms a 
Watson-Crick base pair with the adducted guanine 
and excludes the pyrene ring from the helical DNA, 
which inhibits replication beyond the lesion. 
By contrast, the mismatched dATP stacks above 
the pyrene ring that is intercalated in the helix 
and achieves a productive conformation for 
misincorporation. The intra-helical bulky pyrene 
mimics a base pair in the active site and facilitates 
adenine misincorporation. By structure-based 
mutagenesis, we show that the restrictive active 
site of human polη prevents the intra-helical
conformation and A-base misinsertions. This work
provides one of the molecular mechanisms for G 
to T transversions, a signature mutation in human 
lung cancer. 



INTRODUCTION 

Urban air pollution increases morbidity and mortality 
rates in human populations (1). One of the main contribu- 
tors to the detrimental health effects of air pollution is 
exposure to nitrated polycyclic aromatic hydrocarbons 
(NPAHs). NPAHs are a group of abundant organic 
chemical pollutants, arising from the combustion of 
carbon-containing agents, such as diesel exhaust, industrial 
emissions and cigarette smoke (2). The toxicity of NPAH 
compounds arises from their metabolic nitro-reduction in 
human cells, creating highly reactive species that react with 
genomic DNA (Figure 1A). 1-Nitropyrene (1-NP), the
most prevalent NPAH in the environment, is particularly
abundant in urban air particulate. 1-NP induces mutagenesis
(3,4) and apoptosis (5) by forming DNA adducts
in mammalian cells. 1-NP causes mammary gland tumors
in experimental animals (6,7). Thus, 1-NP and related 
NPAH compounds are suspected to have a major impact 
on human health, especially in populations living in urban 
or industrial areas. Metabolites of 1-NP covalently bind 
to guanine bases in DNA, forming mainly the 
N-[deoxyguanosine-8-yl]-1-aminopyrene (APG) adduct
(Figure 1A) (8,9). The mutagenic signature of the APG 
lesion is the induction of G to T transversions (3). G to T 
transversions are pronounced mutations in lung cancers 
from smokers, and high concentrations of NPAH com- 
pounds in cigarette smoke may be a contributing factor 
to the observed genetic changes (8,10). 

Bulky DNA lesions, such as APG, hinder DNA repli- 
cation carried out by high-fidelity polymerases owing 
to the restrictive active site of these enzymes (11). To 
rescue adduct-stalled replication forks, cells must recruit 
specialized Y-family DNA polymerases, which replicate 
Figure 1. 1-Nitropyrene and Y-family DNA polymerase activity. (A) Reduction of 1-nitropyrene to the nitrenium ion and attachment to a guanine
base of DNA to form APG adducts. (B) Primer extension assays with undamaged G or the APG lesion at the first replication position. Human polη,
polκ and polι were incubated with DNA substrates and reacted with no incoming nucleotides (0), all four nucleotides (N) or individual nucleotides
(A, T, C, G) for either 0.5 min for undamaged G or 30 min for the APG lesion. Vertical arrows indicate nucleotide preferences opposite the APG
lesion, and horizontal arrows indicate stalling bands of APG DNA. DNA substrates are shown above the gels.



through bulky DNA lesions by translesion synthesis 
(12-14). Although these polymerases alleviate stalled rep- 
lication forks, they are also highly error-prone and induce 
mutations. Y-family DNA polymerases have a finger, 
thumb and palm domain similar to all DNA polymerases, 
and a unique fourth domain referred to as the little finger 
(or polymerase-associated domain (PAD)/wrist) domain 
(15-19). The first three domains pack tightly together and
form a catalytic 'core' with a loose connection to the little 
finger domain (20,21). Y-family polymerases have open 
and solvent-exposed active sites, which can accommodate 
distorted and bulky DNA lesions, but are responsible for 
low-fidelity DNA replication (12,15). Multiple Y-family 



polymerases exist in most eukaryotic species, each with 
distinct functionalities (22). Human cells contain four 
Y-family members: Rev1, polymerase η (polη), polymerase
ι (polι) and polymerase κ (polκ) (12,13). The last three
are translesional DNA polymerases that differ in their 
ability to bypass lesions and their fidelity during DNA 
replication. Y-family DNA polymerases are thought to 
be responsible for the mutagenic signature of cells 
exposed to 1-NP. 

To reveal the molecular basis of error-prone replication 
of the APG DNA adduct, we performed functional 
analysis on three human Y-family DNA polymerases in 
APG bypass, determined the structures of human polι in
ternary complex with APG-containing DNA at the insertion
stage and extended our structural observations of polι
toward structural models for polη and polκ APG DNA
adduct bypass in humans.

MATERIALS AND METHODS 

Synthesis of oligonucleotides containing APG 

The protected monomer 2,8-diisobutyryl-8-(1-aminopyrenyl)-
5'-O-(4,4'-dimethoxytrityl)-3'-O-[N,N'-diisopropylamino(2-
cyanoethoxy)phosphonyl]-2'-deoxyguanosine was prepared as
described (23). It was incorporated by standard DNA synthesis
protocol into the oligonucleotide 5'-TCAG*GGGTCCTA
GGACCC-3' (where G* = APG). The mass of the 18-mer
was confirmed by ESI-MS analysis. 

Protein preparation 

Polι protein used for crystallization was expressed and
purified as previously described (24). Proteins used for
replication assays (polι 1-430, polη 1-445, polκ 19-523, polη
R61A 1-445) contained N-terminal histidine tags and were
expressed in Escherichia coli and purified by nickel-affinity
chromatography followed by ion-exchange chromatography.

DNA preparation 

The APG DNA substrates used for crystallization and
activity assays were purified using ion-exchange
chromatography (25). For polι crystallization, the self-annealing
18-mer oligonucleotide containing an APG (G*) lesion
(5'-TCAG*GGGTCCTAGGACCC-3') was annealed with
itself to give a DNA substrate with two replicative ends.
Undamaged oligonucleotides used for primer extension
assays were purchased from Keck Oligo Inc. and
purified by ion exchange. For primer extension assays, a
12-nt primer (5'-CCCAATACCAGTC-3') was annealed
to an 18-nt undamaged G or APG template
(5'-TTGCGGACTGGTATTGGG-3'). For the extension
assays, a 13-nt primer containing C at the 3'-end
(5'-CCCAATACCAGTCC-3') or A at the 3'-end
(5'-CCCAATACCAGTCA-3') was annealed to either the 18-nt
undamaged template or the 18-nt APG template. Primers were
5'-end labeled using [γ-32P]ATP and T4 polynucleotide kinase
and annealed to the template DNA substrates.
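
The substrate designs described above can be checked with simple sequence arithmetic. The following is a minimal sketch rather than part of the original methods: the function names are hypothetical, and it assumes the lesion is the fourth base of the 18-mer (the G of the 5'-TCAG overhang). It tests whether the 3' portion of the crystallization oligonucleotide is self-complementary, so that two copies anneal with the lesion as the first templating base at each end, and whether the insertion-assay primer, as printed, pairs with the 3' end of the template.

```python
# Illustrative check of the DNA substrate designs (helper names are hypothetical).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def self_annealing_overhang(seq: str) -> int:
    """Smallest 5' overhang n such that seq[n:] equals its own reverse complement.

    Two copies of such an oligonucleotide pair through seq[n:], leaving an
    n-nt 5' overhang at each end. Returns -1 if no such n exists.
    """
    for n in range(len(seq)):
        core = seq[n:]
        if core == reverse_complement(core):
            return n
    return -1


def first_template_base(template: str, primer: str) -> str:
    """Template base immediately 5' of the duplex formed by a primer
    annealed to the 3' end of the template (both written 5'->3')."""
    site = reverse_complement(primer)
    if not template.endswith(site):
        raise ValueError("primer does not pair with the 3' end of the template")
    index = len(template) - len(site) - 1
    if index < 0:
        raise ValueError("no unpaired template base remains")
    return template[index]


crystal_oligo = "TCAGGGGTCCTAGGACCC"   # lesion assumed at index 3 (the G of TCAG)
overhang = self_annealing_overhang(crystal_oligo)
print(overhang, crystal_oligo[overhang - 1])

template = "TTGCGGACTGGTATTGGG"
primer = "CCCAATACCAGTC"
print(first_template_base(template, primer))
```

Under these assumptions the duplex region of the self-annealed 18-mer is the last 14 nucleotides, and the base adjacent to the primer terminus is the G carrying the lesion in both substrates.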

Primer extension assays 

DNA substrates (10 nM) were incubated with either polι,
polη, polκ or polη R61A (10 nM) and 100 µM of either all
four dNTPs or individual dNTPs at 37°C in reaction
buffer containing 40 mM Tris (pH 8.0), 5 mM MgCl2,
250 µg/ml bovine serum albumin, 10 mM DTT and
2.5% glycerol. For the primer extension assays, reactions
were carried out for ~2 min with undamaged DNA and
~30 min for APG DNA. Reaction times for all other
experiments are indicated below the gels. Reactions
were terminated with loading buffer (95% formamide,
20 mM EDTA, 0.025% xylene, 0.025% bromophenol
blue) and resolved on a 20% polyacrylamide gel containing
7 M urea. Gels were visualized using a
PhosphorImager (Storm 860, GE Healthcare).



Crystallization and structure determination 

Ternary complexes were formed for APG-dCTP and 
APG-dATP by incubating polι protein (0.2 mM) and
DNA in a 1:1.2 ratio with dNTP (5 mM) and MgCl2
(5 mM). Crystals of both complexes were obtained in
15% PEG 5000 MME, 0.2 M (NH4)2SO4, 2.5% glycerol,
0.1 M MES (pH 6.5). Crystals were flash frozen in liquid
nitrogen directly from dehydrated crystallization drops 
to prevent crystal cracking. X-ray diffraction data were 
collected at beamline 24-ID-E at the Advanced Photon 
Source in Argonne National Laboratory. All data were 
processed and scaled using HKL (26). 

Both structures were solved by molecular replacement 
using PHASER (27), with a previously solved ternary 
complex (PDB: 3GV5) (iota) as a search model. 
Structural refinement was performed using PHENIX 
(28), starting with rigid-body refinement, followed by 
restrained positional and B-factor refinement, and lastly,
TLS refinement (29). Model building was performed using 
COOT (30), and figures were created using PYMOL (31). 

Modeling APG-polymerase complexes 

For modeling the APG lesion:dNTP conformations in
human DNA polη and human polκ, initial PDB structures
of 3MR2 and 2OH2 were used for polη and polκ,
respectively (17,18). Briefly, the polι:APG structures were
superimposed with the structures of polη and polκ to
install the APG substrate (DNA and dNTP) from polι
into polη and polκ. The positions of the replicating base
pairs containing the adducted guanine were slightly
adjusted in the active sites of the original polη and polκ
structures. For modeling APG extension in polι, the
APG-dCTP and APG-dATP structures were used as
starting models. The substrates (DNA and dNTP) in the
complex structure were translocated as a rigid body from
the insertion position down to the extension position.
Then, the undamaged DNA and replicating base pair
from a polι structure (PDB: 2ALZ) were used as a reference
to build up the replicating base pairs in the extension
models.



RESULTS 

Y-family polymerases bypass APG with different fidelities 

To characterize the bypass capability and mutagenic po- 
tential of human Y-family DNA polymerases across the 
APG lesion, we carried out primer extension assays using 
human polι, polη and polκ with either undamaged G or
the APG lesion in the template DNA strand. Opposite
undamaged G, the three enzymes extend the primer
with varying efficiencies, but each polymerase preferentially
incorporates the correct C nucleotide (Figure 1B).
Misincorporation bands are observed for all three 
enzymes, a trend noted with the Y-family DNA polymer- 
ases owing to their open and solvent-accessible active sites 
(15). In the presence of the APG lesion, all three enzymes 
display stalling of replication at the lesion site (strong 
stalling band indicated by horizontal arrows in 
Figure 1B), with polη showing the greatest ability to
bypass the lesion (multiple bands above the stalling band
in the presence of all four nucleotides). The strong stalling
bands in primer extension assays indicate the low efficiency
of the DNA polymerases to extend past the APG
adduct after single nucleotide incorporation opposite the
lesion. Recent kinetic studies have revealed that polη,
polκ and polι have a 6-, 5.7- and 7-fold decrease, respectively,
in replication efficiency opposite the APG lesion
compared with undamaged G, with polη exhibiting the
highest bypass efficiency, followed by polκ, which, in
turn, was more efficient than polι (32). Although the efficiency
of reaction is significantly reduced, all three DNA
polymerases preferentially incorporate the correct C nucleotide
opposite the APG lesion (Figure 1B). Polη and
polι appear to have higher fidelities opposite the APG
lesion relative to an undamaged G. In contrast, polκ has
lower fidelity for the lesion, and significant A and G
misincorporations opposite APG occurred (Figure 1B).
Interestingly, the misincorporation of A by both polι
and polκ increased opposite the APG lesion compared
with undamaged G. Indeed, kinetic experiments have
revealed a 1.4- and 11-fold increase in A incorporation
opposite APG for polι and polκ, respectively.



The increase in A misincorporation is striking considering 
that the mutagenic signature of the APG adduct is G to T 
transversions induced by A mismatches. These results 
indicate that Y-family DNA polymerases are likely to 
cause mutations during cellular replication of the APG 
lesion. 

Polι-APG-dNTP ternary structures: conformational
changes on APG binding 

To elucidate the mechanism of replication stalling and A
misincorporation opposite the APG lesion by polι, we
crystallized polι in complex with APG DNA incorporating
either dCTP or dATP nucleotides directly
opposite the lesion. The DNA substrate for crystallization
was designed so that the lesion was located directly downstream
of the primer-template junction, ready for
dNTP incorporation (Figure 2A). The DNA substrate
was incubated with polι and co-crystallized with either
incoming dCTP or dATP nucleotide. The resulting structures
are denoted as APG-dCTP and APG-dATP, according
to the identity of the incoming nucleotides in the active
site. Both polι-APG ternary crystals diffracted to 2.9 Å
resolution (Table 1), which represent the first set of
Figure 2. Comparisons of polι-APG ternary structures. (A) DNA substrate used for crystallization and the positions of bases. Numbering is relative
to the APG template at position 0. Vertical dashed line indicates axis of 2-fold symmetry. (B) Superposition of APG-dCTP (blue), APG-dATP (pink)
and a previous polι ternary complex with undamaged G (PDB: 2ALZ, grey). Domains are labeled and arrows indicate domain movement relative to
the undamaged G structure. The aminopyrene lesion is shown with incoming nucleotides and metal ions (green spheres). (C) Positioning of APG in
APG-dCTP (blue) and (D) APG-dATP (pink) relative to undamaged G (grey). View is looking from top, down through the DNA helix. Black arrows
indicate backbone DNA movement, and grey block arrows indicate APG base movement. Major and minor groove sides of the DNA helices are
labeled.
Table 1. Summary of crystallographic data

                            APG-dCTP                  APG-dATP

Data collection
  Space group               P6₅22                     P6₅22
  Mol/AU a                  1                         1
  Unit cell
    a, b, c (Å)             98.0, 98.0, 194.6         98.9, 98.9, 194.2
    α, β, γ (°)             90, 90, 120               90, 90, 120
  Resolution (Å) b          32.0-2.90 (2.95-2.90)     32.1-2.90 (2.95-2.90)
  Unique reflections        13253                     13665
  Completeness (%) b        97.2 (98.2)               98.5 (98.8)
  Redundancy b              7.0 (7.2)                 4.0 (4.2)
  I/σ(I) b                  29.4 (2.7)                27.5 (2.7)
  Rmerge (%) b              7.8 (63.3)                6.2 (55.2)

Refinement statistics
  Rwork/Rfree               [illegible]               [illegible]
  Number of atoms
    Protein                 2853                      2951
    DNA                     [illegible]               [illegible]
    dNTP                    [illegible]               [illegible]
    Ions c                  [illegible]               [illegible]
    Water                   28                        28
  Average B factor
    Protein                 85.6                      89.7
    DNA                     88.8                      79.9
    dNTP                    111.4                     106.2
    Ions                    81.5                      82.8
    Waters                  62.2                      88.8
  R.m.s. deviations
    Bonds (Å)               0.007                     0.006
    Angles (°)              1.17                      1.14

a Mol/AU represents the number of molecules per asymmetric unit.
b Data in the highest resolution shell are in parentheses.
c One non-catalytic Mg2+ ion exists in both structures.




structures of a DNA polymerase replicating directly
opposite a bulky pyrene lesion at the insertion stage.
The polι-APG structures have the same crystal form as
previously solved polι structures with undamaged G (33);
the asymmetric unit contains one polι and one-half of the
DNA duplex from a self-annealed DNA oligonucleotide
(see 'Materials and Methods' section). The complexes are
both in productive conformations, with the α-phosphate
of dCTP/dATP in reach of the 3'-OH of the primer strand
(Figures 2 and 3). Electron density is observed in the
APG-dCTP structure linking the 3'-OH of the primer
strand to the α-phosphate, suggesting that some reaction
intermediates or products have been generated in the
crystal (Figure 3A). The 3'-OH group of the primer
strand is 2.9 Å from the α-phosphate in APG-dCTP.
However, in the APG-dATP structure, the majority of
the complexes appear to be unreacted, with disconnected
electron density between the 3'-end of the primer and the
α-phosphate, implying that the polymerase bypass of the
APG lesion is difficult and inefficient (Figure 3B). The
3'-OH group of the primer strand is ~3.5 Å away from
the α-phosphate oxygen, a result of an unusual dATP base
positioning (details in the next section), which likely
reduces the efficiency of reaction. For both APG structures,
two magnesium (Mg) ions are observed in the
active site at similar positions to an undamaged DNA
polι ternary structure (2ALZ). Briefly, in the APG-dCTP
structure, the B-position Mg ion is coordinated by the β
(3.0 Å)- and γ (2.5 Å)-phosphate oxygen atoms, whereas
the A-position Mg ion is coordinated by the α (2.8 Å)-
phosphate oxygen and the primer strand's 3'-OH (3.2 Å),
as well as surrounding O atoms from polι and solvent
(detailed bonding distances listed in Supplementary
Figure S1A). For APG-dATP, the B-position Mg ion is
coordinated by the β (2.7 Å)- and γ (2.4 Å)-phosphate
oxygen atoms, whereas the A-position Mg ion is
coordinated by the α (2.4 Å)-phosphate oxygen and the
3'-OH group of the primer strand (3.2 Å), along with
other oxygen atoms from polι and solvent
(Supplementary Figure S1B).

The overall structures of the two APG complexes look
similar to each other and to previous undamaged DNA-
polι structures (Figure 2B). However, small differences
are noted in the finger positioning, resulting from conformational
differences in APG and the incoming nucleotides
in the active site. The front of the finger domain of
APG-dATP is pushed up by ~2 Å to accommodate dATP
that is off the regular position of an incoming nucleotide
(Figure 2B). It is noteworthy that significant conformational
changes are observed in the little finger domains
and DNA substrates in both APG complexes (Figure 2B),
comparing the APG polι structures with a previously
solved polι ternary complex with undamaged G (2ALZ)
(33). The little finger domains have moved downward by
~8° relative to the undamaged G structure, in response to
the 5'-end of the template DNA moving toward the
solvent-exposed major groove (Figure 2B).
Consequently, the bottom of the DNA substrate has
rotated toward the thumb domain by ~8° to accommodate
the shift in the little finger. The polι-APG structures
share the same crystal form as previously solved polι
structures (2ALZ) with undamaged G, with the largest
interface in the crystal between the protein and DNA substrate.
These observations suggest that the domain and
DNA movements are controlled by the complex structure,
not the packing environment of the crystal lattice. The
domain and DNA re-orientations in our APG structures
are the result of the adjustments necessary for the bulky
APG lesion to be accommodated within the polι active
site. Such adjustments of the substrate and little finger
domain positions have been previously observed with
Dpo4, the model DNA polymerase in the Y family, as
well as with yeast polη and polκ structures (18,20,21,34).
This flexible adjustment of the little finger domain is a
common structural characteristic of the Y-family DNA
polymerases, as the little finger has loose connections to
the rest of the polymerase core (20,21). The finger domain
movement in APG-dATP, however, is not a common
structural observation. The active sites of Y-family polymerases,
mainly defined by the finger domains, have been
observed in a pre-formed and rigid state (15,20). The
finger of Dpo4 does not open up even when replicating
double base lesions, such as CPD (TT dimer) and cisplatin-linked
Pt-GG, which force the two cross-linked
bases to squeeze into the active site (35,36). However,
our current polι structures are the first to show a productive
bulky adduct DNA in the active site of a Y-family
polymerase and possibly reveal a new structural plasticity
within this polymerase family.
Figure 3. The polι-APG ternary complexes with zoomed-in replicating base pairs (A) APG-dCTP and (B) APG-dATP structures. The finger, thumb
and palm domains are colored cyan as the 'core', while the little finger domains are colored light blue. The DNA is in yellow, the APG lesion in red,
dCTP in blue and dATP in grey. Active site metal ions are shown as green spheres. Zoom-in views of the active sites are shown below the structures
with the 2Fo-Fc electron density map contoured at 1σ and in top views of the APG:dCTP and APG:dATP replicating base pairs. The APG lesion is
in red, with sphere representation for the hydrophobic ring in the top views.



APG adduct and incoming nucleotide: dNTP-induced 
APG conformational changes 

The conformation of the adducted G was dramatically
altered compared with undamaged G in the polι active
site (33) (Figure 2). Moreover, the conformation of the
APG lesion was vastly different in the two APG
complexes, which appears to be directly influenced by
the identity of the incoming nucleotides. Undamaged
template purines adopt syn conformations to form a
Hoogsteen base pair with incoming nucleotides, orienting
the C8 atom toward the protein-occluded minor groove in
the polι active site (33) (Figure 2). Polι induces syn
conformations owing to a remarkably narrow active site that
restricts the C1'-C1' distance to <9 Å. A Watson-Crick
base pair requires a C1'-C1' distance of ~10.6 Å, and thus,
this mode of base pairing is highly unfavorable in the
narrow polι active site (24,33). Because the bulky APG
ring is linked through the C8 atom of guanine, a syn
conformation would result in a clash of the bulky pyrene ring
into the protein-occluded minor groove side if the
modified G remains in the active site. To avoid the steric
conflicts, the APG base in the APG-dCTP structure
adopts an anti conformation different from a regular G
template, with the APG ring placed in the solvent-exposed
major groove (designated as an extra-helical conformation,
Figures 2C and 3A). The APG base and DNA
backbone are shifted out toward the major groove
to achieve a standard C1'-C1' distance for anti-
guanine:dCTP Watson-Crick base pairing (Figures 2C
and 3A). This observation is consistent with the prediction
from a previous modeling study that Watson-Crick base
pairing could occur in the polι active site for a major groove
adduct (37). Thus, our APG-dCTP structure illustrates
how polι can accommodate the APG lesion in
Watson-Crick base pairing, a mechanism different from
Hoogsteen base pairing observed in all the previous purine
template structures of polι (33). In this orientation, the
hydrophobic aminopyrene moiety is positioned in the
solvated major groove, with no direct interactions with
either polι or the DNA substrate, generating a mobile
pyrene ring with high B factors (~120 Å²) compared with
the rest of the protein/DNA (~88 Å²). Previous solution
NMR experiments have revealed that the APG lesion
opposite C is favored in an intra-helical conformation
within the DNA helix in the absence of a protein (38).
Thus, polι must force APG into an energetically unfavorable
conformation to create a productive complex with
dCTP, which likely contributes to the low reaction
efficiency.
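
The distance argument in this section can be made concrete with a short calculation. The sketch below is illustrative only and is not the analysis performed in this work: the function names and the example coordinates are hypothetical, and only the two thresholds (a C1'-C1' limit of about 9 Å for the narrow active site and about 10.6 Å for a Watson-Crick pair) are taken from the text.

```python
import math

# Thresholds quoted in the text (Å).
NARROW_SITE_LIMIT = 9.0      # C1'-C1' distance enforced for undamaged purines
WATSON_CRICK_DISTANCE = 10.6  # approximate C1'-C1' distance of a Watson-Crick pair


def c1_c1_distance(template_c1, incoming_c1):
    """Euclidean distance (Å) between two C1' atoms given as (x, y, z) tuples."""
    return math.dist(template_c1, incoming_c1)


def fits_narrow_site(distance, limit=NARROW_SITE_LIMIT):
    """True if a base pair with this C1'-C1' distance fits the narrow site
    without any shift of the template backbone."""
    return distance < limit


# Hypothetical coordinates chosen only to reproduce the two quoted distances.
hoogsteen_like = c1_c1_distance((0.0, 0.0, 0.0), (8.6, 0.0, 0.0))
watson_crick = c1_c1_distance((0.0, 0.0, 0.0), (WATSON_CRICK_DISTANCE, 0.0, 0.0))

print(fits_narrow_site(hoogsteen_like))  # shorter pair fits
print(fits_narrow_site(watson_crick))    # Watson-Crick pair needs extra room
```

In practice the two C1' positions would be read from the deposited coordinates; the point of the sketch is only that a Watson-Crick pair exceeds the limit, which is why the template backbone has to shift toward the major groove in the APG-dCTP complex.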

In the APG-dATP structure, however, the APG-ad- 
ducted G base maintains a syn conformation but projects 
itself into the major groove, allowing the bulky pyrene ring 
to be placed in the replicating base pair position within 
the DNA helix (designated as an intra-helical conform- 
ation, Figures 2D and 3B). The bulky attachment in 
APG is sandwiched between the underlying base pair and 
the incoming dATP (Figure 3B). Interestingly, the NMR 
structure of a DNA helix containing the APG lesion 
opposite dA revealed an almost identical APG conform- 
ation with a syn glycosidic bond and an intercalated pyrene 
ring (39). Thus, intra-helical APG conformations represent 
a low-energy state of the lesion that stabilizes the hydro- 
phobic pyrene ring. Previously observed intercalated 
pyrene rings in complex with Dpo4 polymerase reveal 
primer strands and incoming dNTPs that are separated 
by up to 10 Å and generate non-productive conformations
for lesion bypass (40,41). The APG-dATP structure in 
this work provides the first intercalated pyrene ring in 
a productive complex with a Y-family polymerase. 
The intercalated pyrene ring occupies the space of the 
replicating base pair and prevents the incoming dATP 
from entering the active site in plane with the adducted 
template (Figure 3B). Consequently, the base of dATP is 
forced up one base pair position and stacks on top of 
the pyrene ring, while its phosphate moiety maintains a 
regular position as in other polι structures (Figures 2B
and 3B) (24,33). No hydrogen bonds are formed between 
the dATP base and any template bases. Thus, dATP 
misincorporation is dictated purely by base-stacking inter- 
actions with the intercalated APG bulky ring (Figure 3B). 
The dATP base has the greatest stacking potential of 
all four nucleotides and thus would be most favored 
for stacking above pyrene rings (42). This observation 
is analogous to preferential incorporation of dATP 
on blunt-end DNA through stacking interactions by a 
Y-family polymerase (43). Accordingly, the intercalated 
APG ring mimics a base pair in the blunt end of a DNA 
helix and promotes A misincorporations. This observation 
also indicates how the APG lesion can induce frameshift 
mutations within the genome, a common occurrence with 
this type of DNA damage (44). Because the dATP stacks 
above the APG lesion, it is likely that base pairing could 
occur with template nucleotides above the APG base 
leading to -1 or -2 frameshift mutations. Thus, the
strong stacking potential of the A base provides mechan- 
istic insight into abundant G to T transversions and frame- 
shift mutations (10). 

Structural basis of APG bypass fidelities in different 
Y-family polymerases 

To understand differences in APG incorporation specificity
between different human Y-family polymerases, we
modeled the polι APG base pairs into the active sites of
polη and polκ. Modeling indicates that polη would be
able to accommodate dCTP with APG in the extra-helical
conformation (Figure 4A), but not the stacked dATP with
APG in an intra-helical conformation (Figure 4B). The
Arg 61 residue located on the lid of the finger domain
makes the active site of polη more restrictive than polι
and would clash with the dATP base stacking over the
pyrene ring (Figure 4B), preventing A base misinsertion.
The residues on the lid of the finger domain contact the
replicating base pair in the active site and control the substrate
specificity of Y-family polymerases (15). The model
provides structural insight into the low A misinsertion frequency
of polη opposite APG compared with the other
human polymerases in our replication assays (Figure 1B).
To validate this structural model, Arg 61 of polη was
replaced with Ala by site-specific mutagenesis. This
mutation causes a loss of fidelity opposite APG, with
greatly enhanced A misincorporations (Figure 4F), but it
does not reduce the fidelity opposite undamaged G
(Figure 4E). Thus, the unique active site of polη
prevents A misincorporations opposite APG by inhibiting
productive incoming nucleotide complexes with the
intra-helical APG conformation.

Modeling the APG lesion in polκ reveals that a mismatched
A base would be able to stack above the
intercalated APG lesion, similar to polι, owing to small
residues (Ala 150 and Ala 151) in the polκ finger
domain lid, which contacts the replicating base pair
(Figure 4C and D). A slight shift up of the finger
domain, similar to polι (Figure 2B), would enable polκ
to accommodate the intra-helical APG and stacked
dATP in its active site (Figure 4D). In addition, polκ
has an additional N-clasp to cover the major groove
near the active site that is fully exposed to the solvent in
other Y-family polymerases. The N-clasp coverage may
also contribute to the bulky lesion bypass by protecting
the hydrophobic bulky lesion and bigger purine bases of
incoming nucleotides from aqueous solvent. The model
could explain the low fidelity of APG replication by
polκ observed in the functional assays, particularly for
misinsertion of purine bases A and G with high stacking
potentials against APG (Figure 1B). Nevertheless, the
finger domains of Y-family polymerases play an important
role in replication fidelity and lesion bypass specificity, as
the finger domains directly contact the replicating base
pair in the active sites (15,24). Indeed, swapping finger
domains between polι and Dpo4 Y-family polymerases
results in exchanges of the fidelity and specificity of the 
enzymes (24). The sequence of the finger domain varies 
among Y-family polymerases, which generates unique 
active sites with different shapes, charge distributions 
and flexibilities, allowing for distinctive specificity and 
activity during DNA replication (24). 

APG:dC base pairing induces replication stalling 

To understand how human Y-family polymerases would
elongate a primer strand after the APG lesion, we performed
replication assays to extend the primer strand
beyond APG paired with correct C or mismatched A at
the primer-template junction. Polι efficiently extended
the primer strand from a regular G:C base pair, but
was unable to extend from a lesion APG:C base pair
(Figure 5C). Recent kinetic experiments have revealed
that polι has a ~4500-fold decrease in extension efficiency
past the APG:C lesion base pair compared with undamaged
G (32). Interestingly, when an A nucleotide was
mispaired with G/APG bases, the complete opposite extension
effect was observed: primer extension was inhibited
by a G:A mismatch, whereas a lesion APG:A mismatch
was efficiently extended by polι (Figure 5D). For APG
lesion DNA replication, a similar trend of APG:C
stalling and APG:A extension was observed for polκ
(Figure 5G and H) and polη, albeit to a lesser extent for
polη (Figure 5E and F). The kinetic studies have shown an
18- and 200-fold reduction in APG:C extension efficiency
for polη and polκ, respectively, indicating that the difficulty
in extending the APG lesion paired to the correct C
nucleotide exists for all polymerases, although the magnitude
varies for different polymerases. To understand the
differences in APG extension, we modeled APG:C and
APG:A base pairs observed in our structures into the extension
position (-1 position) in polι. At the -1 position,
the extra-helical APG lesion from APG-dCTP clashes
with the little finger domain (Figure 5A). This structural
conflict would inhibit the translocation of the
template DNA through the polymerase and block
primer elongation. In contrast, the intercalated APG
lesion from the APG-dATP structure can freely translocate
into the -1 position, with no inhibitory protein interactions,
because the APG has no structural conflicts with
polι (Figure 5B). These results suggest that the incoming
nucleotide-induced APG conformation plays a critical role
in primer extension after the APG lesion.

DISCUSSION 

NPAHs are highly abundant environmental pollutants 
with detrimental effects on human health. The structures 
presented here provide the first indication of how a human 
DNA polymerase may replicate directly opposite an 
NPAH-derived DNA adduct. The structures also reveal 
the mechanism of how a major NPAH-guanine lesion 
can induce A misincorporations. The hydrophobic ring 
systems of APG, and presumably other PAH/NPAH
lesions, have the ability to stack above the primer-template
junction in an intra-helical conformation to achieve a 
thermodynamically stable structure. This conformation, 
in turn, allows the mismatched A nucleotide to stack 
above the ring system and be incorporated into the 
growing primer strand in a template-independent 
manner. After one more round of DNA replication, the 
A-mismatched base pair will result in the signature G to 
T transversion. Although the size of the NPAH ring system 
and different chemical attachments could alter the 
base-stacking potential in the DNA helix, the stacking
mechanism of A misincorporation is likely to be common
to all genotoxic PAH/NPAH DNA lesions. We speculate 
that a common mechanism of A misincorporation may 
explain the widespread mutagenic potential of
PAH/NPAH compounds, and their propensity to elicit
carcinogenesis. 

Interestingly, A misincorporations opposite NPAH 
guanine lesions may provide cells with a survival 
advantage over correct C incorporation under certain 
pathological conditions. Our structural and biochemical 
studies have revealed that the mismatched A nucleotide
promotes primer elongation by stabilizing the NPAH ring
system in the DNA helix, whereas the correctly matched C
nucleotide induces replication stalling by polι and polκ
by projecting the NPAH lesion into the protein molecule.
Thus, correct C incorporation may induce replication
fork stalling under certain situations, such as with
non-functional polη. The stalling may contribute to the
cellular toxicity observed for NPAH compounds (5). In
contrast, the mismatched A nucleotide would likely
promote replication fork progression by polι or polκ,
allowing cell survival but leading to a high rate of mutagenesis
(10). This type of error-prone lesion bypass is a
hallmark of Y-family polymerase function and is an evolutionarily
conserved mechanism that allows replication
fork progression during times of cellular stress.

One question that remains to be answered is which 
Y-family polymerase is the predominant enzyme for 
NPAH DNA lesion bypass. It is likely that multiple 
DNA polymerases, particularly translesional Y-family 
polymerases, participate in NPAH lesion bypass in mam- 
malian cells (45). A two-polymerase model has been 
proposed for DNA lesion bypass, in which one polymerase 
performs insertion and another performs primer extension 
for each lesion bypass event (46). From our biochemical 
experiments and previous work by other groups, it appears 
that polη is the predominant polymerase for 'error-free'
APG lesion bypass. However, it is important to note that
many cancers associated with exposure to air-borne carcinogens,
particularly lung and esophageal cancers, are
either linked to the polι gene or have significant polι
over-expression (47-49). Therefore, in pathological situations
where the proportions of Y-family polymerases are
dysregulated, it is likely that an over-abundance of polι
causes a higher frequency of NPAH lesions to be
bypassed by error-prone polι rather than relatively error-free
polη, leading to increased genetic mutations. Indeed, it
has been shown that the high rates of mutations observed
in xeroderma pigmentosum variant syndrome are due to
polι taking over the bypass roles of polη (50).
Furthermore, mutational burden in breast cancer cell
lines (which has been linked to aminopyrene exposure in
animal models) is directly correlated to polι expression
(51). Our structural observations combined with these
lesion-induced mutagenesis results suggest that polι could
play an important role in 1-NP-derived mutations in
human cells.

Although there are different carcinogenic NPAH and
PAH compounds found in polluted air, the metabolites
of these compounds all share three common characteristics:
hydrophobicity, preferential attachment to guanine
nucleotides and induction of G to T transversions at the
site of the guanine adduct (52-56). Thus, the results presented
herein would likely be applicable to a wide variety
of bulky DNA lesions and provide mechanistic insight
into the health dangers imposed by this type of chemical
air pollution.